Data Science for Public Policy
Source: Urban Institute
Expanded Perspectives on Data and Methods
Modern Tools for Learning about Data
The millions of tweets shared on Twitter daily are a rich source of public sentiment on countless topics. In the wake of highly publicized officer-involved shootings, many people take to social media to express their opinions, both positive and negative, about the police. We collected millions of public tweets and employed machine learning to explore whether we can measure public sentiment toward the police. Specifically, we examine how public sentiment changed over time and in response to one high-profile event, the 2015 death of Freddie Gray in Baltimore. While accounting for the larger trends in the public image of the police on Twitter, we find that sentiment became significantly more negative after Gray’s death and during the subsequent protests.
This report uses cluster analysis to sort people into cohesive groups, based on their responses to 23 questions covering an array of political attitudes and values. First developed in 1987, the Pew Research Center’s Political Typology has provided a portrait of the electorate at various points across five presidencies; the last typology study was released in May 2011.
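As a rough illustration of what sorting respondents into cohesive groups can look like, the sketch below runs a plain k-means clustering on a handful of made-up survey answers. The abstract above does not say which clustering algorithm the report uses, so k-means is only a stand-in here, and the `responses` data, the three-question layout, and the choice of two groups are all hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, iters=50):
    """Plain k-means with a deterministic farthest-point initialisation."""
    # Seed with the first point, then repeatedly add the point that is
    # farthest from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )

    labels = []
    for _ in range(iters):
        # Assignment step: each respondent joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Hypothetical respondents: answers to three attitude questions, 1-5 scale.
responses = [
    (1, 2, 1), (2, 1, 1), (1, 1, 2),
    (5, 4, 5), (4, 5, 5), (5, 5, 4),
]
labels, centroids = kmeans(responses, k=2)
print(labels)
```

A real typology would involve many more questions and respondents, survey weights, and a deliberate choice of the number of groups, none of which this toy example attempts.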
Source: Network Propaganda Explored
Lead poisoning is a major public health problem that affects hundreds of thousands of children in the United States every year. A common approach to identifying lead hazards is to test all children for elevated blood lead levels and then investigate and remediate the homes of children with elevated tests. This can prevent future residents from being exposed to lead, but only after a child has already been poisoned. This paper describes joint work with the Chicago Department of Public Health (CDPH) in which we build a model that predicts a child's risk of being poisoned so that an intervention can take place before that happens. Using two decades of blood lead level tests, home lead inspections, property value assessments, and census data, our model allows inspectors to prioritize houses on an intractably long list of potential hazards and identify children who are at the highest risk. This work has been described by CDPH as pioneering in the use of machine learning and predictive analytics in public health and has the potential to have a significant impact on both health and economic outcomes for communities across the US.
Predictive Modeling for Public Health: Preventing Childhood Lead Poisoning
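The idea of scoring risk so that inspectors can prioritize a long list can be sketched in a few lines. The example below fits a small logistic regression by gradient descent and ranks homes by predicted risk. It is only an illustration: the abstract does not name the model the authors used, and the two binary features (`built_before_1978`, `prior_violation`), the training rows, and the home identifiers are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row under the current weights.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, x):
    """Predicted probability of an elevated blood lead test."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical training rows: (built_before_1978, prior_violation).
X = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 1), (0, 0), (1, 0), (0, 0)]
# Hypothetical outcome: 1 if a child at the address later tested elevated.
y = [1, 1, 0, 0, 1, 0, 0, 0]

w, b = fit_logistic(X, y)

# Hypothetical homes awaiting inspection, ranked from highest risk down.
homes = {"A": (0, 0), "B": (1, 1), "C": (1, 0)}
ranking = sorted(homes, key=lambda h: predict_risk(w, b, homes[h]), reverse=True)
print(ranking)
```

The point of the design is the final sort: the model's output is used to order an inspection queue rather than to make a yes-or-no decision about any single home.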
Disaggregating data by race and ethnicity is a critical method for shining light on racialized systems of privilege and oppression. Imputation is a powerful tool for disaggregating data because it adds estimated racial and ethnic identifiers to datasets that lack this information. But if used without a proactive focus on equity, it can harm Black people, Indigenous people, and other people of color.
Source: “Machine Learning and Causal Inference for Policy Evaluation” by Susan Athey